library(flowWorkspace)
library(ggcyto)
library(openCyto)Gating Cells in cytoverse - Part 1
Background
The purpose of flow cytometry is to make inferences regarding some cell type(s) of interest. Often this involves establishing/drawing a gating hierarchy to sequentially filter down to the cell type(s) of interest. This process is very often done manually, and can be very labour intensive. Importantly, manual approach implies large variation when one person does it vs. the next, or even if the same person does it multiple times. The cytoverse offers a suite of tools to tackle this problem in a reproducible and programmatic manner.
GatingSet and GatingHeirarchy
In the previous sections, we have seen (and worked with) cytoframe, and cytoset. These objects hold the underlying data, allowing us to visualize it, manipulate it, etc.
In this section we work with GatingSet and GatingHierarchy. These objects, like the name suggests, store information regarding various gates and filters that we will generate. Importantly, we can save the GatingSet which will completely package the analysis as well as the .fcs files in an opensource format that can be shared, allowing reproducibility.
The Gating sections of this workshop is designed as follows: - In part 1 of this section, we will go over variety of native methods that exist in the cytoverse which analysts can utilize to gate their cells of interest. - In part 2 of this sections, we will go over methods to extract gated data for cells/populations of interest. - In part 3 (optional), we will demonstrate how analysts can automate this process by utilizing a gatingTemplate (a csv file that can be used to define and build the hierarchy).
Creating a GatingSet
Required libraries
To create a GatingSet, first load in a cytoset.
# load cytoset
cs <- load_cytoset_from_fcs(path = "data/fcs-data/", pattern = "*TNK*")
set.seed(20230725)
metadata <- data.frame(name = sampleNames(cs),
status = "Healthy",
Treatment = sample(x = c("Treated","Control"),
size = length(cs),
prob = c(0.5,0.5),replace = TRUE),
panel = ifelse(
grepl(pattern = "TNK",x = sampleNames(cs)),
"T Cell Panel","Myeloid Panel"),
row.names = sampleNames(cs))
pData(cs) <- metadata
# creating a GatingSet
gs <- flowWorkspace::GatingSet(cs)
gsA GatingSet with 4 samples
We have now created a GatingSet called gs.
Exercise
- How would you check the metadata associated with the
GatingSet? - How would you subset
gsto only include samples where Treatment = Control?
The GatingSet needs to be compensated and transformed before we start attaching gates.
Note: The approach to compensate and transform a GatingSet is similar to that of cytoframe and cytoset.
We can directly compensate the GatingSet object.
# compensate a GatingSet
spill <- keyword(gs[[1]],"$SPILLOVER") # extract spillover matrix stored within the file
gs <- compensate(gs,spill) # GatingSet will be compensated and stores the spill matrix as well
recompute(gs) # update the gs
# retrieve compensation information
gs_spill <- gs_get_compensations(gs[[1]])
# output is a rich spillover information
slot(gs_spill[[1]],"spillover")[1:4, 1:4] B515-A B610-A B660-A B710-A
B515-A 1.000000000 0.07573059 0.04266579 0.003687143
B610-A 0.009777704 1.00000000 0.78589800 0.074328487
B660-A 0.005482375 0.03108691 1.00000000 0.112570733
B710-A 0.024698965 0.12699130 0.90643808 1.000000000
As well we can transform the GatingSet.
For convenience, we are using a previously defined transformerList object. The transformerList object was extracted from the workspace file created by the authors of this study and can be found here.
# transformation of GatingSet
my_trans <- readRDS(file = "data/fj-wsp/fj_transform.gz")
# make into transformerList object
my_trans_list <- flowWorkspace::transformerList(
from = names(my_trans),
trans = my_trans
)
gs <- flowWorkspace::transform(gs, my_trans_list) # transforms underlying dataExercise
- How would you be able to tell if a
GatingSethas been compensated? - How would you be able to tell if a
GatingSethas been transformed? Hint: Tryhelp.search("get_transformation"). - Why might it be beneficial to compensate and transform a
GatingSetrather than acytoframeor acytoset?
A note about GatingHierarchy
You may have noticed above that some function calls have gs_ and others have gh_.
gs_ indicates that GatingSet while gh_ indicates a GatingHierarchy.
The GatingHierarchy is a data structure that stores the sample-wise gating information present in a GatingSet. In essence a GatingSet is a collection of GatingHierarchy.
You can access the GatingHierarchy using the [[]] subset operation.
Exercise
- Try running
gh_get_compensations(gs). Why do you get an error message? How would you fix this? - What other methods have
gs_andgh_? - What is the difference between
gs[1]andgs[[1]]?
Adding gates and building a hierarchy
On this GatingSet, we will add various gates that identifies cells of interest. Each sample within a GatingSet is associated with a GatingHierarchy that stores information regarding various gates that we would have created and applied to the samples.
Below, we demonstrate various types of gates that could be estimated automatically or be defined programmatically which can be applied to the samples within a GatingSet to build a hierarchy.
Note: This process is informed by visualization of the data. As such, we will make ample use of the ggcyto library.
Automatic estimation of gates
The gs_add_gating_method call from openCyto library can be leveraged for automatic estimation of gates, as well as clean scripting! Available gating methods can be listed by calling gt_list_methods(). Please see additional examples here: cytoverse.
List of available methods.
Gating Functions:
=== quantileGate
=== gate_quantile
=== rangeGate
=== flowClust.2d
=== gate_flowclust_2d
=== mindensity
=== gate_mindensity
=== mindensity2
=== gate_mindensity2
=== cytokine
=== flowClust.1d
=== gate_flowclust_1d
=== boundary
=== singletGate
=== quadGate.tmix
=== gate_quad_tmix
=== quadGate.seq
=== gate_quad_sequential
Preprocessing Functions:
=== prior_flowClust
=== prior_flowClust
=== warpSet
=== standardize_flowset
Singlets gate
A singlets gate is mostly used for data clean up, by removing out doublets.
# visualization
singlet_vis <- autoplot(
gs_pop_get_data(gs,"root"), # gs_pop_get_data(gs, node) extracts the underlying data
x = "FSC-A",
y = "FSC-H",
bins = 256
) +
facet_wrap(~sampleNames(gs)
) # leverage the metadata that is saved within the GatingSet to facet plots
singlet_visNow, we estimate and add!
# estimate and add
gs_add_gating_method(
gs = gs, # gatingset
alias = "singlets", # name given to the population
pop = "+", # indicate whether events inside or outside the gate should be filtered
parent = "root", # where to attach this node
dims = "FSC-A,FSC-H", # dimensions used to estimate the gate
gating_method = "singletGate", # one of the available gating methods
gating_args = "wider_gate = FALSE", # arguments passed to singletGate
)
# visualize
singlet_vis +
geom_gate(
gs_pop_get_gate(
gs,
"singlets"
)
)+
facet_wrap(~name)Estimating cutpoint and ranges using gate_quantile and gate_mindensity2
gate_quantile and gate_mindensity2 can be used to estimate a cut point in the data, allowing the gating of data to the left(-) or the right(+) of the cut point.
# calculate a live gate
## Example of a quantile gate
## visualize
live_viz <- ggcyto(
gs,
subset = "singlets",
aes(x = "Live", y = "FSC-A")
) +
geom_hex(bins = 128)+
facet_wrap(~name)
# live_viz
# add gate using gate_quantile
gs_add_gating_method(
gs,
alias = "live",
pop = "-",
parent = "singlets",
dims = "U450-A",
gating_method = "gate_quantile",
gating_args = list(
probs = 0.95
)
)
# add gate using gate_mindensity2
gs_add_gating_method(
gs,
alias = "live_mindensity",
pop = "-",
parent = "singlets",
dims = "U450-A",
gating_method = "gate_mindensity2",
gating_args = list(
max = 100,
gate_range = c(50, 75)
)
)
# visualize the 2 extimated gates
live_viz+
geom_gate(gs_pop_get_gate(gs, "live"), colour = "red")+
geom_gate(gs_pop_get_gate(gs, "live_mindensity"), colour = "blue")We can also use the cutpoints to estimate a rectangular gate!
# conventional T cells
t_cell_vis <- ggcyto(
gs, subset = "live",
aes(x = "TCR Va7_2", y = "CD161")
) + geom_hex(bins = 256)+
facet_wrap(~name)
# visualize
# t_cell_vis
# estimate and add
gs_add_gating_method(
gs,
alias = "MAIT Cells",
pop = "++",
parent = "live",
dims = "G660-A,V710-A",
gating_method = "gate_quantile",
gating_args = list(
probs = 0.95,
min = 50,
max = 200
)
)...
done
# visualize
t_cell_vis+
geom_gate(gs_pop_get_gate(gs,"MAIT Cells"))Estimating an Ellipsoid gate
gate_flowclust_2d can be used to identify and generate an Ellipsoid gate in a semi-supervised manner.
cd3_ellipse <- ggcyto(gs,
subset = "live",
aes(x = "CD4", y = "CD3"))+
geom_hex(bins = 256)
# estimate and add
gs_add_gating_method(
gs,
alias = "CD3+",
pop = "+",
parent = "live",
dims = "U785-A,V510-A",
gating_method = "gate_flowclust_2d",
gating_args = list(
K = 3,
target = c(100,150),
quantile = 0.9,
plot = FALSE
)
)
# visualize
cd3_ellipse+
geom_gate(gs_pop_get_gate(gs,"CD3+"))Exercise
- How would you add a gate called
CD4+ T Cellsdefined as: CD4+ and CD3+?
Hint: Use the visualization above to provide an appropriate target. gating_argsabove indicatesK = 3, quantile = 0.9, plot = FALSE. Try running the code block by indicatingplot = TRUE. What is the outcome? How can you run this code block again? Hint: try runninghelp.search(pattern = "remove gating").
Quad gate
When there are population(s) of interest in all 4 quadrants, it is useful to estimate a quadrant gate using gate_quad_sequential. Note: Pay attention to alias and pop arguments!
# plot subsets
t_subsets <- ggcyto(
gs,
subset = "CD3+",
aes(x = "CD45RA", y = "CCR7")
)+
geom_hex(bins = 256)+
facet_wrap(~name)
# visualize
# t_subsets
# estimate and add
gs_add_gating_method(
gs,
alias = "*",
pop = "+/-+/-",
parent = "CD3+",
dims = "CD45RA,CCR7",
gating_method = "gate_quad_sequential",
gating_args = list(
gFunc = "mindensity"
),
collapseDataForGating = TRUE
)Exercise
- How would you confirm that the attached gates have correct boundaries? Hint: Refer to the section on visualization.
- How would you remove the gates attached by
gs_add_gating_method? Hint: Try runninghelp(gs_add_gating_method)and read the details.gs_remove_gating_methodreverses the the results ofgs_add_gating_methodin a step wise manner. - Are there other arguments for
gs_add_gating_method? How can you use them?
Note: gs_add_gating_method keeps a history of the calls made. If you reload a previously saved GatingSet via load_gs to work on, start with a clean history using gs_add_gating_method_init(gs) so that errors do not arise.
Explicit construction of gates
Suppose you would like to add a gate that has a particular shape or range that cannot be easily estimated using the automatic estimation approaches above. In this case, you could programmatically define a gate by either providing ranges, coordinates, or logical statements as demonstrated below.
Rectangular gate
As the name suggests, we can explicitly define a rectangular gate by by defining a matrix where the columns are the ranges for markers in x-axis and y-axis. Only 1 dimension can also be provided. In this scenario, the constructed gate filter the events marginated over the dimension provided.
## Example of rectangleGate
cd3_vis <- ggcyto(
gs, subset = "root", aes(x = "CD3", y = "CD56")
)+
geom_hex(bins = 256)+
facet_wrap(~name)
# before
# cd3_vis
# using rectangle gate to add T cell gate
cd3_rectangle <- matrix(
c(140, 205, 0, 200),
nrow = 2,
ncol = 2,
byrow = F,
dimnames = list(
c("min", "max"), # rownames
c("V510-A","U570-A") # colnames are channel names
)
)
cd3_rectangle_gate <- rectangleGate(
.gate = cd3_rectangle,
filterId = "CD3+ T cells"
)
cd3_vis+geom_gate(cd3_rectangle_gate)When a single dimension is provided:
# provide the range
cd3_range <- matrix(data = c(140,200))
# set the column name to the required channel
colnames(cd3_range) <- "V510-A"
# construct gate
cd3_range_gate <- rectangleGate(
.gate = cd3_range,
filterId = "CD3+ Range"
)
cd3_vis + geom_gate(cd3_range_gate)To add this gate we call gs_pop_add like so:
gs_pop_add(
gs,
parent = "live",
gate = cd3_range_gate
)[1] 13
Each call to gs_pop_add requires a recompute(gs) call.
Exercise
- Try to run the following code:
no_recompute <- ggcyto(gs,
subset = "CD3+ Range",
aes(x = "CD3", y = "CD4"))+
geom_hex(bins = 256)
no_recomputeWhy do you get an error?
gs_add_pop attaches the gates to the GatingSet, the index of the filtered events are not calculated until you call recompute.
Polygon gate
Similar to Rectangular gate we can define a Polygon gate by generating a matrix of vertices.
Here, each row of the matrix is the coordinate for a vertex and the columns are the channels.
## Example of polygonGate
nkt_vis <- ggcyto(gs, subset = "root",
aes(x = "CD1d", y = "CD3"))+ # fuzzy matching of marker names
geom_hex(bins = 256)+
facet_wrap(~name)Warning in getChannelMarker(frm, dim): CD1d is partially matched with
R670-ACD1d:PBS57 tet APC
# visualize
# nkt_vis
# define coordinates
## coordinates are based on visualization!
nkt_poly <- matrix(
c(
115,140, # are arranged as x,y pair
150,150,
150,180,
200,180,
200,140
),
ncol = 2,
byrow = T, # indicates that add as x,y pair
dimnames = list(
NULL, # rownames have no meaning in polygonGates
c("R670-A","V510-A") # colnames are channel names
)
)
# create a gate
nkt_poly_gate <- polygonGate(
nkt_poly,
filterId = "NKT cells"
)
# visualize
nkt_vis + geom_gate(nkt_poly_gate)Editing a gate after construction is straight forward.
# move up and scale
nkt_poly_gate_scale <- flowCore::transform_gate(
nkt_poly_gate, # gate object
dx = 1, # which dimension to shift and by how much
# dy = 1,
# scale = 2 # scales both dimensions equally
scale = c(1.05,1.05) # individually scale each dimension
)
nkt_vis+geom_gate(nkt_poly_gate_scale)Exercise
- Construct a new
Rectangular gatethat can be used to filter CD4- CD3+ population. Hint: Visualize the data and then define the best ranges. - How would you rotate this gate? Hint: try running
help(flowCore::transform_gate). - Think back to the
CD3+gate we created usinggs_add_gating_method, how would you edit it? Hint: try runninghelp.search("get or set gate").
Boolean Gates
We can use booleanFilter to simply negate a gated population! The resulting filter does not have a geometric representation.
# add not MAIT gate
## Example of booleanFilter
not_mait <- booleanFilter(`!MAIT Cells`, filterId = "not_MAIT")
# add Boolean gate
gs_pop_add(
gs,
not_mait,
parent = "live"
)[1] 14
recompute(gs)
# visualize
t_cell_vis+
geom_overlay(
gs_pop_get_data(gs, y = "not_MAIT"),
size = 0.3,
colour = "red"
)Note: We are also able to combine multiple gates to generate a booleanFilter.
Let’s create by combining not_MAIT and MAIT Cells
# using an OR ('|') operator
all_cells <- booleanFilter(
`not_MAIT|MAIT Cells`,
filterId = "All"
)
# add population
gs_pop_add(gs, all_cells, parent = "live")
recompute(gs)Exercise
- How would you create a
booleanFilterthat only filters out naive T cells defined as CD45RA+CCR7+? Can you visualize the result? - How would you remove the gate called
All? - How would you remove an attached gate? Is
recomputerequired after removal of a gate? - What happens if you remove a parent gate?
Visualizing the Gating Hierachy
We have been creating and adding many gates to our GatingSet. A useful way to visualize all the gates and their relationship is by plotting the gating Hierarchy as a tree. This gives us an immediate summary of what nodes are present in our GatingSet.
# plot the gating tree
plot(gs, bool = TRUE)Another useful approach is to visualize the gated data, with gates that we have generated. See the section of visualization for more.
# visualize the full gating hierarchy
autoplot(
gs[[1]],
bins = 256,
bool = TRUE)+
ggcyto_par_set(limits = "data") # set data range to be determined by dataSaving your work
Finally, we save your GatingSet. The GatingSet can be loaded back into R using load_gs("path/to/a/folder").
# save your work
save_gs(gs, path = "path/to/a/folder")Conclusion
In this section, we spent some time identifying various types of gates that are available in cytoverse. As well, we demonstrated how to programmatically create such gates, either manually (i.e. by defining ranges or vertices) or in a semi-automated and data driven manner.
Next, we go over methods to extract the gated data.
The (Optional) Part 3 goes over how users can leverage a gatingTemplate to generate gates as we have done here, but with minimal scripting. It is worth noting that while the use gatingTemplate minimizes scripting, it does not compromise on reproducibility!
Lastly, we also made ample use of the ggcyto library in order to generate visualizations that helped in QC’ing the data and the gates that we generated. We will go over ggcyto in more detail in Visualizations.